-
Notifications
You must be signed in to change notification settings - Fork 118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SNOW-1869388 add memoization to to_selectable #2815
SNOW-1869388 add memoization to to_selectable #2815
Conversation
tests/unit/conftest.py
Outdated
@@ -79,6 +79,7 @@ def mock_session(mock_analyzer) -> Session: | |||
fake_session = mock.create_autospec(Session) | |||
fake_session._cte_optimization_enabled = False | |||
fake_session._analyzer = mock_analyzer | |||
fake_session._lock = mock.MagicMock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have a more descriptive name to this lock -- similar to _plan_lock
below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this lock is actually part of Session
object and used as a general purpose lock for serialization:
self._lock = create_rlock(self._conn._thread_safe_session_enabled) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks!
selectable._is_valid_for_replacement = ( | ||
snowflake_plan._is_valid_for_replacement | ||
) | ||
self._to_selectable_memo_dict[snowflake_plan._uuid] = selectable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we probably should remember the original logical plan id instead of the resolved plan id, because the different snowflake plan might be created for the same orignal plan node
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea it's true
return self._to_selectable_memo_dict[plan._uuid] | ||
snowflake_plan = self.resolve(plan) | ||
selectable = SelectSnowflakePlan(snowflake_plan, analyzer=self) | ||
selectable._is_valid_for_replacement = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in fact, it should always be true in query generator instead of propogating from the snowlfake plan
selectable = SelectSnowflakePlan(snowflake_plan, analyzer=self) | ||
selectable._is_valid_for_replacement = snowflake_plan._is_valid_for_replacement | ||
return selectable | ||
with self.session._lock: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need a lock here, but not at other place in query_generator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need it. query generator is not shared between threads so we should be good. That's why it is not added in other places.
Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.
Fixes SNOW-1869388
Fill out the following pre-review checklist:
Please describe how your code solves the related issue.
The process of adding WithQueryBlocks during repeated subquery elimination optimization takes place as follows:
When parents for the WithQueryBlocks are updated, if the direct parent is a Selectable, we convert the SnowflakePlan into a Selectable using to_selectable. In case the WithQueryBlock has multiple parents, which by construction will always be true, a new Selectable object is created for each parent. See example below where new selectable nodes are highlighted in gray:
When large query block is applied on this plan, and the WithQueryBlock is chosen at eligible, since all these blocks have same statistics, they will all be eligible together and possible be materialized we create redundant materializations. The correct construction for such a case should look like the plan shown below: